Overview

Dataset statistics

Number of variables: 40
Number of observations: 380932
Missing cells: 250345
Missing cells (%): 1.6%
Duplicate rows: 11
Duplicate rows (%): < 0.1%
Total size in memory: 810.5 MiB
Average record size in memory: 2.2 KiB
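
The percentage and per-record figures in these statistics can be re-derived from the raw counts. The following is a minimal sketch, assuming that the missing-cell rate is taken over all cells (variables times observations) and that MiB and KiB are binary units; both are assumptions about how the tool computes them, although they reproduce the reported values.

```python
# Values copied from the dataset statistics in this report.
n_variables = 40
n_observations = 380932
missing_cells = 250345
total_mib = 810.5

# Assumption: the missing-cell rate is computed over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```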

Variable types

Text: 1
Categorical: 10
Numeric: 6
Boolean: 23

Alerts

Dataset has 11 (< 0.1%) duplicate rows  [Duplicates]
HeightInMeters is highly overall correlated with WeightInKilograms and 1 other field  [High correlation]
WeightInKilograms is highly overall correlated with HeightInMeters and 1 other field  [High correlation]
BMI is highly overall correlated with WeightInKilograms  [High correlation]
Sex is highly overall correlated with HeightInMeters  [High correlation]
AgeCategory is highly overall correlated with PneumoVaxEver  [High correlation]
PneumoVaxEver is highly overall correlated with AgeCategory  [High correlation]
HadHeartAttack is highly imbalanced (68.0%)  [Imbalance]
HadAngina is highly imbalanced (66.5%)  [Imbalance]
HadStroke is highly imbalanced (73.9%)  [Imbalance]
HadSkinCancer is highly imbalanced (59.0%)  [Imbalance]
HadCOPD is highly imbalanced (59.0%)  [Imbalance]
HadKidneyDisease is highly imbalanced (72.6%)  [Imbalance]
HadDiabetes is highly imbalanced (59.4%)  [Imbalance]
DeafOrHardOfHearing is highly imbalanced (55.4%)  [Imbalance]
BlindOrVisionDifficulty is highly imbalanced (68.7%)  [Imbalance]
DifficultyDressingBathing is highly imbalanced (75.6%)  [Imbalance]
DifficultyErrands is highly imbalanced (60.3%)  [Imbalance]
HighRiskLastYear is highly imbalanced (74.2%)  [Imbalance]
PhysicalHealthDays has 8988 (2.4%) missing values  [Missing]
MentalHealthDays has 7420 (1.9%) missing values  [Missing]
LastCheckupTime has 6793 (1.8%) missing values  [Missing]
SleepHours has 4376 (1.1%) missing values  [Missing]
RemovedTeeth has 9230 (2.4%) missing values  [Missing]
ChestScan has 16179 (4.2%) missing values  [Missing]
RaceEthnicityCategory has 10780 (2.8%) missing values  [Missing]
AgeCategory has 6348 (1.7%) missing values  [Missing]
HeightInMeters has 8631 (2.3%) missing values  [Missing]
WeightInKilograms has 20361 (5.3%) missing values  [Missing]
BMI has 25608 (6.7%) missing values  [Missing]
AlcoholDrinkers has 4901 (1.3%) missing values  [Missing]
HIVTesting has 18321 (4.8%) missing values  [Missing]
PneumoVaxEver has 29798 (7.8%) missing values  [Missing]
TetanusLast10Tdap has 34687 (9.1%) missing values  [Missing]
PhysicalHealthDays has 229002 (60.1%) zeros  [Zeros]
MentalHealthDays has 226066 (59.3%) zeros  [Zeros]
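
The imbalance percentages in these alerts are not class shares. They are consistent with a score of 1 minus the Shannon entropy of the non-missing class counts, normalised by the maximum entropy for that number of classes. The sketch below works under that assumption; the function name `imbalance_score` is illustrative and is not taken from the tool. The counts come from the HadHeartAttack and HadDiabetes sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a perfectly balanced variable
    return 1 - entropy / max_entropy

# Non-missing counts as reported in the variable sections of this report.
had_heart_attack = [356585, 21970]          # False, True
had_diabetes = [314433, 53423, 9054, 3268]  # four categories

print(f"HadHeartAttack: {imbalance_score(had_heart_attack):.1%}")
print(f"HadDiabetes: {imbalance_score(had_diabetes):.1%}")
```

A score of 0 means perfectly balanced classes; values close to 1 mean a single class dominates.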

Reproduction

Analysis started: 2023-11-20 14:31:43.616747
Analysis finished: 2023-11-20 14:33:47.046204
Duration: 2 minutes and 3.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
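
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-11-20 14:31:43.616747")
finished = datetime.fromisoformat("2023-11-20 14:33:47.046204")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```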

Variables

State
Text

Distinct: 54
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.7 MiB

Length

Max length: 20
Median length: 12
Mean length: 8.3455525
Min length: 4

Characters and Unicode

Total characters: 3179088
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
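
To illustrate how such per-category counts can be obtained, here is a minimal sketch using Python's standard `unicodedata` module. The sample values and the helper name `category_counts` are hypothetical, and the profiling tool's own implementation may differ. The two-letter codes returned by `unicodedata.category` map to the names used in this report: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# Hypothetical sample; the real column has 380932 rows.
sample = ["Alabama", "New York", "Ohio"]
print(category_counts(sample))
```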

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Alabama
2nd row: Alabama
3rd row: Alabama
4th row: Alabama
5th row: Alabama
Value              Count   Frequency (%)
new                30806   6.7%
washington         22388   4.9%
south              15350   3.4%
york               14586   3.2%
minnesota          13887   3.0%
ohio               13647   3.0%
maryland           13362   2.9%
virginia           13340   2.9%
carolina           12471   2.7%
texas              11979   2.6%
Other values (50)  294635  64.5%
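
The rows in this table ("new", "south", "york", "carolina") suggest that it counts lowercase, whitespace-separated tokens rather than whole state names. The sketch below works under that assumption; the helper `word_counts` and the sample list are hypothetical, and the tool's actual tokenizer may differ.

```python
from collections import Counter

def word_counts(values):
    """Lowercase each value, split on whitespace, and count the tokens."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# Hypothetical sample, not the real column.
sample = ["New York", "New Jersey", "South Carolina", "New York", "Ohio"]
for word, n in word_counts(sample).most_common(3):
    print(word, n)
```

Under this reading, a token such as "new" aggregates every state whose name contains it, which would explain why it outranks any single state.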

Most occurring characters

ValueCountFrequency (%)
a 412273
13.0%
i 301451
 
9.5%
n 283027
 
8.9%
o 270327
 
8.5%
s 223560
 
7.0%
e 183171
 
5.8%
r 160799
 
5.1%
t 146492
 
4.6%
h 107351
 
3.4%
l 91189
 
2.9%
Other values (36) 999448
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2649785
83.4%
Uppercase Letter 453784
 
14.3%
Space Separator 75519
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 412273
15.6%
i 301451
11.4%
n 283027
10.7%
o 270327
10.2%
s 223560
8.4%
e 183171
 
6.9%
r 160799
 
6.1%
t 146492
 
5.5%
h 107351
 
4.1%
l 91189
 
3.4%
Other values (14) 470145
17.7%
Uppercase Letter
ValueCountFrequency (%)
M 76342
16.8%
N 48184
10.6%
W 39947
 
8.8%
C 39570
 
8.7%
I 32212
 
7.1%
O 24038
 
5.3%
A 22155
 
4.9%
V 21965
 
4.8%
D 16865
 
3.7%
T 16517
 
3.6%
Other values (11) 115989
25.6%
Space Separator
ValueCountFrequency (%)
75519
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3103569
97.6%
Common 75519
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 412273
13.3%
i 301451
 
9.7%
n 283027
 
9.1%
o 270327
 
8.7%
s 223560
 
7.2%
e 183171
 
5.9%
r 160799
 
5.2%
t 146492
 
4.7%
h 107351
 
3.5%
l 91189
 
2.9%
Other values (35) 923929
29.8%
Common
ValueCountFrequency (%)
75519
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3179088
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 412273
13.0%
i 301451
 
9.5%
n 283027
 
8.9%
o 270327
 
8.5%
s 223560
 
7.0%
e 183171
 
5.8%
r 160799
 
5.1%
t 146492
 
4.6%
h 107351
 
3.4%
l 91189
 
2.9%
Other values (36) 999448
31.4%

Sex
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.5 MiB
Female
201499 
Male
179433 

Length

Max length6
Median length6
Mean length5.0579263
Min length4

Characters and Unicode

Total characters1926726
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFemale
2nd rowFemale
3rd rowFemale
4th rowFemale
5th rowMale

Common Values

ValueCountFrequency (%)
Female 201499
52.9%
Male 179433
47.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
female 201499
52.9%
male 179433
47.1%

Most occurring characters

ValueCountFrequency (%)
e 582431
30.2%
a 380932
19.8%
l 380932
19.8%
F 201499
 
10.5%
m 201499
 
10.5%
M 179433
 
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1545794
80.2%
Uppercase Letter 380932
 
19.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 582431
37.7%
a 380932
24.6%
l 380932
24.6%
m 201499
 
13.0%
Uppercase Letter
ValueCountFrequency (%)
F 201499
52.9%
M 179433
47.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1926726
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 582431
30.2%
a 380932
19.8%
l 380932
19.8%
F 201499
 
10.5%
m 201499
 
10.5%
M 179433
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1926726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 582431
30.2%
a 380932
19.8%
l 380932
19.8%
F 201499
 
10.5%
m 201499
 
10.5%
M 179433
 
9.3%

GeneralHealth
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 957
Missing (%): 0.3%
Memory size: 23.0 MiB
Very good
127352 
Good
122852 
Excellent
60315 
Fair
52426 
Poor
17030 

Length

Max length9
Median length4
Mean length6.4694651
Min length4

Characters and Unicode

Total characters2458235
Distinct characters19
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowVery good
2nd rowExcellent
3rd rowExcellent
4th rowFair
5th rowPoor

Common Values

ValueCountFrequency (%)
Very good 127352
33.4%
Good 122852
32.3%
Excellent 60315
15.8%
Fair 52426
13.8%
Poor 17030
 
4.5%
(Missing) 957
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
good 250204
49.3%
very 127352
25.1%
excellent 60315
 
11.9%
fair 52426
 
10.3%
poor 17030
 
3.4%

Most occurring characters

ValueCountFrequency (%)
o 534468
21.7%
d 250204
10.2%
e 247982
10.1%
r 196808
 
8.0%
V 127352
 
5.2%
y 127352
 
5.2%
127352
 
5.2%
g 127352
 
5.2%
G 122852
 
5.0%
l 120630
 
4.9%
Other values (9) 475883
19.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1950908
79.4%
Uppercase Letter 379975
 
15.5%
Space Separator 127352
 
5.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 534468
27.4%
d 250204
12.8%
e 247982
12.7%
r 196808
 
10.1%
y 127352
 
6.5%
g 127352
 
6.5%
l 120630
 
6.2%
t 60315
 
3.1%
n 60315
 
3.1%
c 60315
 
3.1%
Other values (3) 165167
 
8.5%
Uppercase Letter
ValueCountFrequency (%)
V 127352
33.5%
G 122852
32.3%
E 60315
15.9%
F 52426
13.8%
P 17030
 
4.5%
Space Separator
ValueCountFrequency (%)
127352
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2330883
94.8%
Common 127352
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 534468
22.9%
d 250204
10.7%
e 247982
10.6%
r 196808
 
8.4%
V 127352
 
5.5%
y 127352
 
5.5%
g 127352
 
5.5%
G 122852
 
5.3%
l 120630
 
5.2%
t 60315
 
2.6%
Other values (8) 415568
17.8%
Common
ValueCountFrequency (%)
127352
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2458235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 534468
21.7%
d 250204
10.2%
e 247982
10.1%
r 196808
 
8.0%
V 127352
 
5.2%
y 127352
 
5.2%
127352
 
5.2%
g 127352
 
5.2%
G 122852
 
5.0%
l 120630
 
4.9%
Other values (9) 475883
19.4%

PhysicalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 8988
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3849908
Minimum: 0
Maximum: 30
Zeros: 229002
Zeros (%): 60.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 3
95th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 8.7420914
Coefficient of variation (CV): 1.9936396
Kurtosis: 3.3427103
Mean: 4.3849908
Median Absolute Deviation (MAD): 0
Skewness: 2.1640866
Sum: 1630971
Variance: 76.424162
Monotonicity: Not monotonic
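
Several of these descriptive statistics are linked by simple identities, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance and IQR for PhysicalHealthDays from the other reported figures. It assumes that the mean is taken over non-missing rows only; the variable names are illustrative.

```python
# Values copied from the PhysicalHealthDays section of this report.
n_rows, n_missing = 380932, 8988
total, mean, std = 1630971, 4.3849908, 8.7420914
q1, q3 = 0, 3

# Assumption: statistics are computed over non-missing rows only.
n_valid = n_rows - n_missing

mean_from_sum = total / n_valid  # should match the reported mean
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range

print("mean from sum:", mean_from_sum)
print("CV:", cv)
print("variance:", variance)
print("IQR:", iqr)
```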
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
0 229002
60.1%
30 28883
 
7.6%
2 21830
 
5.7%
1 14936
 
3.9%
3 13573
 
3.6%
5 13031
 
3.4%
10 9008
 
2.4%
7 7866
 
2.1%
15 7661
 
2.0%
4 7212
 
1.9%
Other values (21) 18942
 
5.0%
(Missing) 8988
 
2.4%
ValueCountFrequency (%)
0 229002
60.1%
1 14936
 
3.9%
2 21830
 
5.7%
3 13573
 
3.6%
4 7212
 
1.9%
5 13031
 
3.4%
6 2152
 
0.6%
7 7866
 
2.1%
8 1494
 
0.4%
9 333
 
0.1%
ValueCountFrequency (%)
30 28883
7.6%
29 309
 
0.1%
28 635
 
0.2%
27 162
 
< 0.1%
26 92
 
< 0.1%
25 1876
 
0.5%
24 99
 
< 0.1%
23 86
 
< 0.1%
22 118
 
< 0.1%
21 882
 
0.2%

MentalHealthDays
Real number (ℝ)

MISSING  ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 7420
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4157323
Minimum: 0
Maximum: 30
Zeros: 226066
Zeros (%): 59.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 5
95th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 8.4040866
Coefficient of variation (CV): 1.9032147
Kurtosis: 3.3020549
Mean: 4.4157323
Median Absolute Deviation (MAD): 0
Skewness: 2.1097337
Sum: 1649329
Variance: 70.628671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
0 226066
59.3%
30 23209
 
6.1%
2 20481
 
5.4%
5 17184
 
4.5%
10 13325
 
3.5%
3 13174
 
3.5%
15 12674
 
3.3%
1 12440
 
3.3%
20 7926
 
2.1%
4 6868
 
1.8%
Other values (21) 20165
 
5.3%
(Missing) 7420
 
1.9%
ValueCountFrequency (%)
0 226066
59.3%
1 12440
 
3.3%
2 20481
 
5.4%
3 13174
 
3.5%
4 6868
 
1.8%
5 17184
 
4.5%
6 1997
 
0.5%
7 6834
 
1.8%
8 1476
 
0.4%
9 260
 
0.1%
ValueCountFrequency (%)
30 23209
6.1%
29 418
 
0.1%
28 779
 
0.2%
27 206
 
0.1%
26 90
 
< 0.1%
25 2669
 
0.7%
24 104
 
< 0.1%
23 86
 
< 0.1%
22 165
 
< 0.1%
21 472
 
0.1%

LastCheckupTime
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 6793
Missing (%): 1.8%
Memory size: 38.1 MiB
Within past year (anytime less than 12 months ago)
301086 
Within past 2 years (1 year but less than 2 years ago)
35527 
Within past 5 years (2 years but less than 5 years ago)
 
21212
5 or more years ago
 
16314

Length

Max length55
Median length50
Mean length49.311577
Min length19

Characters and Unicode

Total characters18449384
Distinct characters23
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowWithin past year (anytime less than 12 months ago)
2nd rowWithin past year (anytime less than 12 months ago)
3rd rowWithin past year (anytime less than 12 months ago)
4th rowWithin past year (anytime less than 12 months ago)
5th rowWithin past year (anytime less than 12 months ago)

Common Values

ValueCountFrequency (%)
Within past year (anytime less than 12 months ago) 301086
79.0%
Within past 2 years (1 year but less than 2 years ago) 35527
 
9.3%
Within past 5 years (2 years but less than 5 years ago) 21212
 
5.6%
5 or more years ago 16314
 
4.3%
(Missing) 6793
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ago 374139
10.8%
within 357825
10.3%
past 357825
10.3%
less 357825
10.3%
than 357825
10.3%
year 336613
9.7%
anytime 301086
8.7%
12 301086
8.7%
months 301086
8.7%
years 151004
4.3%
Other values (6) 275898
7.9%

Most occurring characters

ValueCountFrequency (%)
3098073
16.8%
a 1878492
10.2%
t 1732386
9.4%
s 1525565
 
8.3%
n 1317822
 
7.1%
e 1162842
 
6.3%
h 1016736
 
5.5%
i 1016736
 
5.5%
y 788703
 
4.3%
o 707853
 
3.8%
Other values (13) 4204176
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13489133
73.1%
Space Separator 3098073
 
16.8%
Decimal Number 788703
 
4.3%
Close Punctuation 357825
 
1.9%
Uppercase Letter 357825
 
1.9%
Open Punctuation 357825
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1878492
13.9%
t 1732386
12.8%
s 1525565
11.3%
n 1317822
9.8%
e 1162842
8.6%
h 1016736
7.5%
i 1016736
7.5%
y 788703
5.8%
o 707853
 
5.2%
m 618486
 
4.6%
Other values (6) 1723512
12.8%
Decimal Number
ValueCountFrequency (%)
2 393352
49.9%
1 336613
42.7%
5 58738
 
7.4%
Space Separator
ValueCountFrequency (%)
3098073
100.0%
Close Punctuation
ValueCountFrequency (%)
) 357825
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 357825
100.0%
Open Punctuation
ValueCountFrequency (%)
( 357825
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13846958
75.1%
Common 4602426
 
24.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1878492
13.6%
t 1732386
12.5%
s 1525565
11.0%
n 1317822
9.5%
e 1162842
8.4%
h 1016736
7.3%
i 1016736
7.3%
y 788703
 
5.7%
o 707853
 
5.1%
m 618486
 
4.5%
Other values (7) 2081337
15.0%
Common
ValueCountFrequency (%)
3098073
67.3%
2 393352
 
8.5%
) 357825
 
7.8%
( 357825
 
7.8%
1 336613
 
7.3%
5 58738
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18449384
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3098073
16.8%
a 1878492
10.2%
t 1732386
9.4%
s 1525565
 
8.3%
n 1317822
 
7.1%
e 1162842
 
6.3%
h 1016736
 
5.5%
i 1016736
 
5.5%
y 788703
 
4.3%
o 707853
 
3.8%
Other values (13) 4204176
22.8%
Distinct2
Distinct (%)< 0.1%
Missing805
Missing (%)0.2%
Memory size744.1 KiB
True
288481 
False
91646 
(Missing)
 
805
ValueCountFrequency (%)
True 288481
75.7%
False 91646
 
24.1%
(Missing) 805
 
0.2%

SleepHours
Real number (ℝ)

MISSING 

Distinct: 24
Distinct (%): < 0.1%
Missing: 4376
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0228226
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 5
Q1: 6
Median: 7
Q3: 8
95th percentile: 9
Maximum: 24
Range: 23
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4911012
Coefficient of variation (CV): 0.2123222
Kurtosis: 8.0076209
Mean: 7.0228226
Median Absolute Deviation (MAD): 1
Skewness: 0.69860711
Sum: 2644486
Variance: 2.2233827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
7 113898
29.9%
8 106981
28.1%
6 82268
21.6%
5 25944
 
6.8%
9 18421
 
4.8%
4 10642
 
2.8%
10 9046
 
2.4%
3 2764
 
0.7%
12 2561
 
0.7%
2 1267
 
0.3%
Other values (14) 2764
 
0.7%
(Missing) 4376
 
1.1%
ValueCountFrequency (%)
1 911
 
0.2%
2 1267
 
0.3%
3 2764
 
0.7%
4 10642
 
2.8%
5 25944
 
6.8%
6 82268
21.6%
7 113898
29.9%
8 106981
28.1%
9 18421
 
4.8%
10 9046
 
2.4%
ValueCountFrequency (%)
24 34
 
< 0.1%
23 11
 
< 0.1%
22 13
 
< 0.1%
21 2
 
< 0.1%
20 113
< 0.1%
19 13
 
< 0.1%
18 143
< 0.1%
17 21
 
< 0.1%
16 258
0.1%
15 262
0.1%

RemovedTeeth
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 9230
Missing (%): 2.4%
Memory size: 24.3 MiB
None of them
198338 
1 to 5
111375 
6 or more, but not all
39884 
All
22105 

Length

Max length22
Median length12
Mean length10.739972
Min length3

Characters and Unicode

Total characters3992069
Distinct characters19
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNone of them
2nd rowNone of them
3rd row1 to 5
4th row1 to 5
5th rowNone of them

Common Values

ValueCountFrequency (%)
None of them 198338
52.1%
1 to 5 111375
29.2%
6 or more, but not all 39884
 
10.5%
All 22105
 
5.8%
(Missing) 9230
 
2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 198338
16.7%
of 198338
16.7%
them 198338
16.7%
1 111375
9.4%
to 111375
9.4%
5 111375
9.4%
all 61989
 
5.2%
6 39884
 
3.4%
or 39884
 
3.4%
more 39884
 
3.4%
Other values (2) 79768
6.7%

Most occurring characters

ValueCountFrequency (%)
818846
20.5%
o 627703
15.7%
e 436560
10.9%
t 389481
9.8%
n 238222
 
6.0%
m 238222
 
6.0%
N 198338
 
5.0%
f 198338
 
5.0%
h 198338
 
5.0%
l 123978
 
3.1%
Other values (9) 524043
13.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2650262
66.4%
Space Separator 818846
 
20.5%
Decimal Number 262634
 
6.6%
Uppercase Letter 220443
 
5.5%
Other Punctuation 39884
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 627703
23.7%
e 436560
16.5%
t 389481
14.7%
n 238222
 
9.0%
m 238222
 
9.0%
f 198338
 
7.5%
h 198338
 
7.5%
l 123978
 
4.7%
r 79768
 
3.0%
b 39884
 
1.5%
Other values (2) 79768
 
3.0%
Decimal Number
ValueCountFrequency (%)
1 111375
42.4%
5 111375
42.4%
6 39884
 
15.2%
Uppercase Letter
ValueCountFrequency (%)
N 198338
90.0%
A 22105
 
10.0%
Space Separator
ValueCountFrequency (%)
818846
100.0%
Other Punctuation
ValueCountFrequency (%)
, 39884
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2870705
71.9%
Common 1121364
 
28.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 627703
21.9%
e 436560
15.2%
t 389481
13.6%
n 238222
 
8.3%
m 238222
 
8.3%
N 198338
 
6.9%
f 198338
 
6.9%
h 198338
 
6.9%
l 123978
 
4.3%
r 79768
 
2.8%
Other values (4) 141757
 
4.9%
Common
ValueCountFrequency (%)
818846
73.0%
1 111375
 
9.9%
5 111375
 
9.9%
6 39884
 
3.6%
, 39884
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3992069
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
818846
20.5%
o 627703
15.7%
e 436560
10.9%
t 389481
9.8%
n 238222
 
6.0%
m 238222
 
6.0%
N 198338
 
5.0%
f 198338
 
5.0%
h 198338
 
5.0%
l 123978
 
3.1%
Other values (9) 524043
13.1%

HadHeartAttack
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing2377
Missing (%)0.6%
Memory size744.1 KiB
False
356585 
True
 
21970
(Missing)
 
2377
ValueCountFrequency (%)
False 356585
93.6%
True 21970
 
5.8%
(Missing) 2377
 
0.6%

HadAngina
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing3647
Missing (%)1.0%
Memory size744.1 KiB
False
353890 
True
 
23395
(Missing)
 
3647
ValueCountFrequency (%)
False 353890
92.9%
True 23395
 
6.1%
(Missing) 3647
 
1.0%

HadStroke
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1184
Missing (%)0.3%
Memory size744.1 KiB
False
362969 
True
 
16779
(Missing)
 
1184
ValueCountFrequency (%)
False 362969
95.3%
True 16779
 
4.4%
(Missing) 1184
 
0.3%

HadAsthma
Boolean

Distinct2
Distinct (%)< 0.1%
Missing1377
Missing (%)0.4%
Memory size744.1 KiB
False
321806 
True
57749 
(Missing)
 
1377
ValueCountFrequency (%)
False 321806
84.5%
True 57749
 
15.2%
(Missing) 1377
 
0.4%

HadSkinCancer
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing2573
Missing (%)0.7%
Memory size744.1 KiB
False
347198 
True
 
31161
(Missing)
 
2573
ValueCountFrequency (%)
False 347198
91.1%
True 31161
 
8.2%
(Missing) 2573
 
0.7%

HadCOPD
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1726
Missing (%)0.5%
Memory size744.1 KiB
False
348041 
True
 
31165
(Missing)
 
1726
ValueCountFrequency (%)
False 348041
91.4%
True 31165
 
8.2%
(Missing) 1726
 
0.5%
Distinct2
Distinct (%)< 0.1%
Missing2222
Missing (%)0.6%
Memory size744.1 KiB
False
298652 
True
80058 
(Missing)
 
2222
ValueCountFrequency (%)
False 298652
78.4%
True 80058
 
21.0%
(Missing) 2222
 
0.6%

HadKidneyDisease
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1490
Missing (%)0.4%
Memory size744.1 KiB
False
361533 
True
 
17909
(Missing)
 
1490
ValueCountFrequency (%)
False 361533
94.9%
True 17909
 
4.7%
(Missing) 1490
 
0.4%
Distinct2
Distinct (%)< 0.1%
Missing2092
Missing (%)0.5%
Memory size744.1 KiB
False
246483 
True
132357 
(Missing)
 
2092
ValueCountFrequency (%)
False 246483
64.7%
True 132357
34.7%
(Missing) 2092
 
0.5%

HadDiabetes
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing754
Missing (%)0.2%
Memory size21.9 MiB
No
314433 
Yes
53423 
No, pre-diabetes or borderline diabetes
 
9054
Yes, but only during pregnancy (female)
 
3268

Length

Max length39
Median length2
Mean length3.339733
Min length2

Characters and Unicode

Total characters1269693
Distinct characters25
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowYes
2nd rowNo
3rd rowNo
4th rowNo
5th rowYes

Common Values

ValueCountFrequency (%)
No 314433
82.5%
Yes 53423
 
14.0%
No, pre-diabetes or borderline diabetes 9054
 
2.4%
Yes, but only during pregnancy (female) 3268
 
0.9%
(Missing) 754
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 323487
74.8%
yes 56691
 
13.1%
pre-diabetes 9054
 
2.1%
or 9054
 
2.1%
borderline 9054
 
2.1%
diabetes 9054
 
2.1%
but 3268
 
0.8%
only 3268
 
0.8%
during 3268
 
0.8%
pregnancy 3268
 
0.8%

Most occurring characters

ValueCountFrequency (%)
o 344863
27.2%
N 323487
25.5%
e 129873
 
10.2%
s 74799
 
5.9%
Y 56691
 
4.5%
52556
 
4.1%
r 42752
 
3.4%
d 30430
 
2.4%
b 30430
 
2.4%
i 30430
 
2.4%
Other values (15) 153382
12.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 809047
63.7%
Uppercase Letter 380178
29.9%
Space Separator 52556
 
4.1%
Other Punctuation 12322
 
1.0%
Dash Punctuation 9054
 
0.7%
Open Punctuation 3268
 
0.3%
Close Punctuation 3268
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 344863
42.6%
e 129873
 
16.1%
s 74799
 
9.2%
r 42752
 
5.3%
d 30430
 
3.8%
b 30430
 
3.8%
i 30430
 
3.8%
a 24644
 
3.0%
n 22126
 
2.7%
t 21376
 
2.6%
Other values (8) 57324
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
N 323487
85.1%
Y 56691
 
14.9%
Space Separator
ValueCountFrequency (%)
52556
100.0%
Other Punctuation
ValueCountFrequency (%)
, 12322
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9054
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3268
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3268
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1189225
93.7%
Common 80468
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 344863
29.0%
N 323487
27.2%
e 129873
 
10.9%
s 74799
 
6.3%
Y 56691
 
4.8%
r 42752
 
3.6%
d 30430
 
2.6%
b 30430
 
2.6%
i 30430
 
2.6%
a 24644
 
2.1%
Other values (10) 100826
 
8.5%
Common
ValueCountFrequency (%)
52556
65.3%
, 12322
 
15.3%
- 9054
 
11.3%
( 3268
 
4.1%
) 3268
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1269693
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 344863
27.2%
N 323487
25.5%
e 129873
 
10.2%
s 74799
 
5.9%
Y 56691
 
4.5%
52556
 
4.1%
r 42752
 
3.4%
d 30430
 
2.4%
b 30430
 
2.4%
i 30430
 
2.4%
Other values (15) 153382
12.1%

DeafOrHardOfHearing
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1388
Missing (%)0.4%
Memory size744.1 KiB
False
344325 
True
35219 
(Missing)
 
1388
ValueCountFrequency (%)
False 344325
90.4%
True 35219
 
9.2%
(Missing) 1388
 
0.4%

BlindOrVisionDifficulty
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1178
Missing (%)0.3%
Memory size744.1 KiB
False
358295 
True
 
21459
(Missing)
 
1178
ValueCountFrequency (%)
False 358295
94.1%
True 21459
 
5.6%
(Missing) 1178
 
0.3%
Distinct2
Distinct (%)< 0.1%
Missing2540
Missing (%)0.7%
Memory size744.1 KiB
False
332695 
True
45697 
(Missing)
 
2540
ValueCountFrequency (%)
False 332695
87.3%
True 45697
 
12.0%
(Missing) 2540
 
0.7%
Distinct2
Distinct (%)< 0.1%
Missing1336
Missing (%)0.4%
Memory size744.1 KiB
False
317250 
True
62346 
(Missing)
 
1336
ValueCountFrequency (%)
False 317250
83.3%
True 62346
 
16.4%
(Missing) 1336
 
0.4%

DifficultyDressingBathing
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing594
Missing (%)0.2%
Memory size744.1 KiB
False
364982 
True
 
15356
(Missing)
 
594
ValueCountFrequency (%)
False 364982
95.8%
True 15356
 
4.0%
(Missing) 594
 
0.2%

DifficultyErrands
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing1207
Missing (%)0.3%
Memory size744.1 KiB
False
349920 
True
 
29805
(Missing)
 
1207
ValueCountFrequency (%)
False 349920
91.9%
True 29805
 
7.8%
(Missing) 1207
 
0.3%

SmokerStatus
Categorical

Distinct4
Distinct (%)< 0.1%
Missing2777
Missing (%)0.7%
Memory size26.2 MiB
Never smoked
227504 
Former smoker
104810 
Current smoker - now smokes every day
33183 
Current smoker - now smokes some days
 
12658

Length

Max length37
Median length12
Mean length15.307731
Min length12

Characters and Unicode

Total characters5788695
Distinct characters19
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNever smoked
2nd rowNever smoked
3rd rowCurrent smoker - now smokes some days
4th rowNever smoked
5th rowNever smoked

Common Values

ValueCountFrequency (%)
Never smoked 227504
59.7%
Former smoker 104810
27.5%
Current smoker - now smokes every day 33183
 
8.7%
Current smoker - now smokes some days 12658
 
3.3%
(Missing) 2777
 
0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
never 227504
23.1%
smoked 227504
23.1%
smoker 150651
15.3%
former 104810
10.6%
current 45841
 
4.7%
45841
 
4.7%
now 45841
 
4.7%
smokes 45841
 
4.7%
every 33183
 
3.4%
day 33183
 
3.4%
Other values (2) 25316
 
2.6%

Most occurring characters

ValueCountFrequency (%)
e 1108679
19.2%
r 712640
12.3%
607360
10.5%
o 587305
10.1%
m 541464
9.4%
s 495153
8.6%
k 423996
 
7.3%
d 273345
 
4.7%
v 260687
 
4.5%
N 227504
 
3.9%
Other values (9) 550562
9.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4757339
82.2%
Space Separator 607360
 
10.5%
Uppercase Letter 378155
 
6.5%
Dash Punctuation 45841
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1108679
23.3%
r 712640
15.0%
o 587305
12.3%
m 541464
11.4%
s 495153
10.4%
k 423996
 
8.9%
d 273345
 
5.7%
v 260687
 
5.5%
n 91682
 
1.9%
y 79024
 
1.7%
Other values (4) 183364
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
N 227504
60.2%
F 104810
27.7%
C 45841
 
12.1%
Space Separator
ValueCountFrequency (%)
607360
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 45841
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5135494
88.7%
Common 653201
 
11.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1108679
21.6%
r 712640
13.9%
o 587305
11.4%
m 541464
10.5%
s 495153
9.6%
k 423996
 
8.3%
d 273345
 
5.3%
v 260687
 
5.1%
N 227504
 
4.4%
F 104810
 
2.0%
Other values (7) 399911
 
7.8%
Common
ValueCountFrequency (%)
607360
93.0%
- 45841
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5788695
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1108679
19.2%
r 712640
12.3%
607360
10.5%
o 587305
10.1%
m 541464
9.4%
s 495153
8.6%
k 423996
 
7.3%
d 273345
 
4.7%
v 260687
 
4.5%
N 227504
 
3.9%
Other values (9) 550562
9.5%

ECigaretteUsage
Categorical

Distinct4
Distinct (%)< 0.1%
Missing1519
Missing (%)0.4%
Memory size33.8 MiB
Never used e-cigarettes in my entire life
289772 
Not at all (right now)
69430 
Use them some days
 
10727
Use them every day
 
9484

Length

Max length41
Median length41
Mean length36.297939
Min length18

Characters and Unicode

Total characters13771910
Distinct characters25
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
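
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below is an assumption about how such totals can be derived, not the report's own code; it weights each character of a label by that label's count.

```python
import unicodedata
from collections import Counter

# Category labels and counts as reported for ECigaretteUsage.
counts = {
    "Never used e-cigarettes in my entire life": 289772,
    "Not at all (right now)": 69430,
    "Use them some days": 10727,
    "Use them every day": 9484,
}

category_totals = Counter()
for label, n in counts.items():
    for ch in label:
        # Two-letter general-category codes, e.g. "Ll" = Lowercase Letter,
        # "Lu" = Uppercase Letter, "Zs" = Space Separator,
        # "Pd" = Dash Punctuation, "Ps"/"Pe" = Open/Close Punctuation.
        category_totals[unicodedata.category(ch)] += n

for code, total in category_totals.most_common():
    print(code, total)
```

The six codes produced here correspond to the six categories counted for this variable.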

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not at all (right now)
2nd row: Never used e-cigarettes in my entire life
3rd row: Never used e-cigarettes in my entire life
4th row: Never used e-cigarettes in my entire life
5th row: Never used e-cigarettes in my entire life

Common Values

Value                                       Count    Frequency (%)
Never used e-cigarettes in my entire life   289772   76.1%
Not at all (right now)                      69430    18.2%
Use them some days                          10727    2.8%
Use them every day                          9484     2.5%
(Missing)                                   1519     0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values; image not reproduced]

Value              Count    Frequency (%)
never              289772   11.8%
used               289772   11.8%
e-cigarettes       289772   11.8%
in                 289772   11.8%
my                 289772   11.8%
entire             289772   11.8%
life               289772   11.8%
now                69430    2.8%
right              69430    2.8%
all                69430    2.8%
Other values (8)   219704   8.9%

Most occurring characters

ValueCountFrequency (%)
e 2678065
19.4%
2076985
15.1%
i 1228518
 
8.9%
t 1097817
 
8.0%
r 948230
 
6.9%
n 648974
 
4.7%
s 621209
 
4.5%
a 448843
 
3.3%
l 428632
 
3.1%
g 359202
 
2.6%
Other values (15) 3235435
23.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10886880
79.1%
Space Separator 2076985
 
15.1%
Uppercase Letter 379413
 
2.8%
Dash Punctuation 289772
 
2.1%
Open Punctuation 69430
 
0.5%
Close Punctuation 69430
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2678065
24.6%
i 1228518
11.3%
t 1097817
10.1%
r 948230
 
8.7%
n 648974
 
6.0%
s 621209
 
5.7%
a 448843
 
4.1%
l 428632
 
3.9%
g 359202
 
3.3%
m 320710
 
2.9%
Other values (9) 2106680
19.4%
Uppercase Letter
ValueCountFrequency (%)
N 359202
94.7%
U 20211
 
5.3%
Space Separator
ValueCountFrequency (%)
2076985
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 289772
100.0%
Open Punctuation
ValueCountFrequency (%)
( 69430
100.0%
Close Punctuation
ValueCountFrequency (%)
) 69430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11266293
81.8%
Common 2505617
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2678065
23.8%
i 1228518
10.9%
t 1097817
9.7%
r 948230
 
8.4%
n 648974
 
5.8%
s 621209
 
5.5%
a 448843
 
4.0%
l 428632
 
3.8%
g 359202
 
3.2%
N 359202
 
3.2%
Other values (11) 2447601
21.7%
Common
ValueCountFrequency (%)
2076985
82.9%
- 289772
 
11.6%
( 69430
 
2.8%
) 69430
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13771910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2678065
19.4%
2076985
15.1%
i 1228518
 
8.9%
t 1097817
 
8.0%
r 948230
 
6.9%
n 648974
 
4.7%
s 621209
 
4.5%
a 448843
 
3.3%
l 428632
 
3.1%
g 359202
 
2.6%
Other values (15) 3235435
23.5%

ChestScan
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 16179
Missing (%): 4.2%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
False       207889   54.6%
True        156864   41.2%
(Missing)   16179    4.2%
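
The frequencies for ChestScan are shares of all 380932 observations, with missing values counted as their own row. A minimal sketch of that arithmetic, assuming the observation count from the dataset overview (variable names are illustrative):

```python
n_observations = 380932  # "Number of observations" from the overview

# Non-missing counts as reported for ChestScan.
counts = {"False": 207889, "True": 156864}

# Whatever is not False or True is missing.
counts["(Missing)"] = n_observations - sum(counts.values())

# Percentage of all observations, rounded to one decimal as in the report.
freq = {value: round(100 * n / n_observations, 1) for value, n in counts.items()}

for value in counts:
    print(value, counts[value], f"{freq[value]}%")
```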

RaceEthnicityCategory
Categorical

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 10780
Missing (%): 2.8%
Memory size: 28.5 MiB

Value                           Count
White only, Non-Hispanic        277070
Hispanic                        36482
Black only, Non-Hispanic        29403
Other race only, Non-Hispanic   18858
Multiracial, Non-Hispanic       8339

Length

Max length: 29
Median length: 24
Mean length: 22.70031
Min length: 8

Characters and Unicode

Total characters: 8402565
Distinct characters: 24
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White only, Non-Hispanic
2nd row: White only, Non-Hispanic
3rd row: White only, Non-Hispanic
4th row: White only, Non-Hispanic
5th row: White only, Non-Hispanic

Common Values

Value                           Count    Frequency (%)
White only, Non-Hispanic        277070   72.7%
Hispanic                        36482    9.6%
Black only, Non-Hispanic        29403    7.7%
Other race only, Non-Hispanic   18858    5.0%
Multiracial, Non-Hispanic       8339     2.2%
(Missing)                       10780    2.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values; image not reproduced]

Value          Count    Frequency (%)
non-hispanic   333670   31.8%
only           325331   31.0%
white          277070   26.4%
hispanic       36482    3.5%
black          29403    2.8%
other          18858    1.8%
race           18858    1.8%
multiracial    8339     0.8%

Most occurring characters

ValueCountFrequency (%)
i 1034052
 
12.3%
n 1029153
 
12.2%
677859
 
8.1%
o 659001
 
7.8%
a 435091
 
5.2%
c 426752
 
5.1%
l 371412
 
4.4%
H 370152
 
4.4%
s 370152
 
4.4%
p 370152
 
4.4%
Other values (14) 2658789
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6019874
71.6%
Uppercase Letter 1037492
 
12.3%
Space Separator 677859
 
8.1%
Other Punctuation 333670
 
4.0%
Dash Punctuation 333670
 
4.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1034052
17.2%
n 1029153
17.1%
o 659001
10.9%
a 435091
7.2%
c 426752
7.1%
l 371412
 
6.2%
s 370152
 
6.1%
p 370152
 
6.1%
y 325331
 
5.4%
e 314786
 
5.2%
Other values (5) 683992
11.4%
Uppercase Letter
ValueCountFrequency (%)
H 370152
35.7%
N 333670
32.2%
W 277070
26.7%
B 29403
 
2.8%
O 18858
 
1.8%
M 8339
 
0.8%
Space Separator
ValueCountFrequency (%)
677859
100.0%
Other Punctuation
ValueCountFrequency (%)
, 333670
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 333670
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7057366
84.0%
Common 1345199
 
16.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1034052
14.7%
n 1029153
14.6%
o 659001
 
9.3%
a 435091
 
6.2%
c 426752
 
6.0%
l 371412
 
5.3%
H 370152
 
5.2%
s 370152
 
5.2%
p 370152
 
5.2%
N 333670
 
4.7%
Other values (11) 1657779
23.5%
Common
ValueCountFrequency (%)
677859
50.4%
, 333670
24.8%
- 333670
24.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8402565
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1034052
 
12.3%
n 1029153
 
12.2%
677859
 
8.1%
o 659001
 
7.8%
a 435091
 
5.2%
c 426752
 
5.1%
l 371412
 
4.4%
H 370152
 
4.4%
s 370152
 
4.4%
p 370152
 
4.4%
Other values (14) 2658789
31.6%

AgeCategory
Categorical

HIGH CORRELATION  MISSING 

Distinct: 13
Distinct (%): < 0.1%
Missing: 6348
Missing (%): 1.7%
Memory size: 24.9 MiB

Value              Count
Age 65 to 69       41071
Age 70 to 74       38192
Age 60 to 64       38166
Age 80 or older    31864
Age 55 to 59       31423
Other values (8)   193868

Length

Max length: 15
Median length: 12
Mean length: 12.255195
Min length: 12

Characters and Unicode

Total characters: 4590600
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Age 80 or older
2nd row: Age 80 or older
3rd row: Age 40 to 44
4th row: Age 80 or older
5th row: Age 80 or older

Common Values

Value              Count   Frequency (%)
Age 65 to 69       41071   10.8%
Age 70 to 74       38192   10.0%
Age 60 to 64       38166   10.0%
Age 80 or older    31864   8.4%
Age 55 to 59       31423   8.2%
Age 75 to 79       28661   7.5%
Age 50 to 54       28448   7.5%
Age 40 to 44       25065   6.6%
Age 45 to 49       24036   6.3%
Age 35 to 39       23980   6.3%
Other values (3)   63678   16.7%

Length

Histogram of lengths of the category

Value               Count    Frequency (%)
age                 374584   25.0%
to                  342720   22.9%
65                  41071    2.7%
69                  41071    2.7%
70                  38192    2.5%
74                  38192    2.5%
60                  38166    2.5%
64                  38166    2.5%
80                  31864    2.1%
or                  31864    2.1%
Other values (19)   482446   32.2%

Most occurring characters

ValueCountFrequency (%)
1123752
24.5%
e 406448
 
8.9%
o 406448
 
8.9%
A 374584
 
8.2%
g 374584
 
8.2%
t 342720
 
7.5%
5 287767
 
6.3%
4 272897
 
5.9%
0 183403
 
4.0%
9 168025
 
3.7%
Other values (9) 649972
14.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1657656
36.1%
Decimal Number 1434608
31.3%
Space Separator 1123752
24.5%
Uppercase Letter 374584
 
8.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 287767
20.1%
4 272897
19.0%
0 183403
12.8%
9 168025
11.7%
6 158474
11.0%
7 133706
9.3%
3 91296
 
6.4%
2 60864
 
4.2%
8 55020
 
3.8%
1 23156
 
1.6%
Lowercase Letter
ValueCountFrequency (%)
e 406448
24.5%
o 406448
24.5%
g 374584
22.6%
t 342720
20.7%
r 63728
 
3.8%
l 31864
 
1.9%
d 31864
 
1.9%
Space Separator
ValueCountFrequency (%)
1123752
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 374584
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2558360
55.7%
Latin 2032240
44.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1123752
43.9%
5 287767
 
11.2%
4 272897
 
10.7%
0 183403
 
7.2%
9 168025
 
6.6%
6 158474
 
6.2%
7 133706
 
5.2%
3 91296
 
3.6%
2 60864
 
2.4%
8 55020
 
2.2%
Latin
ValueCountFrequency (%)
e 406448
20.0%
o 406448
20.0%
A 374584
18.4%
g 374584
18.4%
t 342720
16.9%
r 63728
 
3.1%
l 31864
 
1.6%
d 31864
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4590600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1123752
24.5%
e 406448
 
8.9%
o 406448
 
8.9%
A 374584
 
8.2%
g 374584
 
8.2%
t 342720
 
7.5%
5 287767
 
6.3%
4 272897
 
5.9%
0 183403
 
4.0%
9 168025
 
3.7%
Other values (9) 649972
14.2%

HeightInMeters
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 108
Distinct (%): < 0.1%
Missing: 8631
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7025712
Minimum: 0.91
Maximum: 2.41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 0.91
5-th percentile: 1.52
Q1: 1.63
median: 1.7
Q3: 1.78
95-th percentile: 1.88
Maximum: 2.41
Range: 1.5
Interquartile range (IQR): 0.15

Descriptive statistics

Standard deviation: 0.10717064
Coefficient of variation (CV): 0.062946351
Kurtosis: 0.14597114
Mean: 1.7025712
Median Absolute Deviation (MAD): 0.08
Skewness: 0.025750182
Sum: 633868.95
Variance: 0.011485547
Monotonicity: Not monotonic
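
Several of the HeightInMeters figures are derived from the others: the range and IQR from the quantiles, and the CV, variance, and sum from the mean, standard deviation, and non-missing count. The sketch below recomputes them from the reported values as a consistency check; it is not the report's own code, and small differences are expected because the inputs are already rounded.

```python
# Reported values for HeightInMeters.
minimum, q1, q3, maximum = 0.91, 1.63, 1.78, 2.41
mean, std = 1.7025712, 0.10717064
n_observations, n_missing = 380932, 8631

value_range = maximum - minimum          # reported: 1.5
iqr = q3 - q1                            # reported: 0.15
cv = std / mean                          # reported: 0.062946351
variance = std ** 2                      # reported: 0.011485547
total = mean * (n_observations - n_missing)  # reported sum: 633868.95

print(round(value_range, 2), round(iqr, 2), round(cv, 6),
      round(variance, 6), round(total, 1))
```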
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
1.68                32889   8.6%
1.63                31802   8.3%
1.7                 30340   8.0%
1.65                29211   7.7%
1.78                28710   7.5%
1.73                27642   7.3%
1.75                25991   6.8%
1.6                 25347   6.7%
1.83                25172   6.6%
1.57                24093   6.3%
Other values (98)   91104   23.9%

Value   Count   Frequency (%)
0.91    18      < 0.1%
0.92    1       < 0.1%
0.95    1       < 0.1%
0.97    4       < 0.1%
0.99    1       < 0.1%
1       4       < 0.1%
1.02    2       < 0.1%
1.03    2       < 0.1%
1.04    18      < 0.1%
1.05    26      < 0.1%

Value   Count   Frequency (%)
2.41    4       < 0.1%
2.36    1       < 0.1%
2.34    2       < 0.1%
2.29    5       < 0.1%
2.26    10      < 0.1%
2.24    1       < 0.1%
2.21    6       < 0.1%
2.18    6       < 0.1%
2.16    9       < 0.1%
2.13    20      < 0.1%

WeightInKilograms
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 584
Distinct (%): 0.2%
Missing: 20361
Missing (%): 5.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.217059
Minimum: 22.68
Maximum: 292.57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 22.68
5-th percentile: 54.43
Q1: 68.04
median: 81.19
Q3: 95.25
95-th percentile: 122.47
Maximum: 292.57
Range: 269.89
Interquartile range (IQR): 27.21

Descriptive statistics

Standard deviation: 21.485738
Coefficient of variation (CV): 0.2581891
Kurtosis: 2.6537337
Mean: 83.217059
Median Absolute Deviation (MAD): 13.15
Skewness: 1.0632404
Sum: 30005658
Variance: 461.63692
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count    Frequency (%)
90.72                18980    5.0%
81.65                17515    4.6%
68.04                15508    4.1%
72.57                15298    4.0%
77.11                14218    3.7%
86.18                12711    3.3%
63.5                 11408    3.0%
79.38                10405    2.7%
99.79                9759     2.6%
74.84                9697     2.5%
Other values (574)   225072   59.1%
(Missing)            20361    5.3%

Value   Count   Frequency (%)
22.68   8       < 0.1%
23      1       < 0.1%
23.13   1       < 0.1%
23.59   1       < 0.1%
24      1       < 0.1%
24.04   2       < 0.1%
24.49   1       < 0.1%
24.95   5       < 0.1%
25.4    3       < 0.1%
25.85   3       < 0.1%

Value    Count   Frequency (%)
292.57   1       < 0.1%
290.3    1       < 0.1%
285      1       < 0.1%
281.68   1       < 0.1%
281      1       < 0.1%
280.32   1       < 0.1%
280      1       < 0.1%
278.96   1       < 0.1%
276.24   1       < 0.1%
274.42   1       < 0.1%

BMI
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 3887
Distinct (%): 1.1%
Missing: 25608
Missing (%): 6.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.586092
Minimum: 12.02
Maximum: 99.64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 MiB

Quantile statistics

Minimum: 12.02
5-th percentile: 20.16
Q1: 24.14
median: 27.44
Q3: 31.84
95-th percentile: 40.72
Maximum: 99.64
Range: 87.62
Interquartile range (IQR): 7.7

Descriptive statistics

Standard deviation: 6.5714124
Coefficient of variation (CV): 0.22988146
Kurtosis: 4.1968959
Mean: 28.586092
Median Absolute Deviation (MAD): 3.74
Skewness: 1.3606458
Sum: 10157324
Variance: 43.183461
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                 Count    Frequency (%)
26.63                 3757     1.0%
27.46                 2945     0.8%
24.41                 2858     0.8%
27.44                 2787     0.7%
27.12                 2738     0.7%
25.1                  2435     0.6%
32.28                 2176     0.6%
29.53                 2081     0.5%
29.29                 2067     0.5%
25.84                 2062     0.5%
Other values (3877)   329418   86.5%
(Missing)             25608    6.7%

Value   Count   Frequency (%)
12.02   1       < 0.1%
12.05   1       < 0.1%
12.06   1       < 0.1%
12.11   3       < 0.1%
12.16   4       < 0.1%
12.19   1       < 0.1%
12.21   3       < 0.1%
12.24   1       < 0.1%
12.27   3       < 0.1%
12.3    1       < 0.1%

Value   Count   Frequency (%)
99.64   1       < 0.1%
97.65   4       < 0.1%
97.43   1       < 0.1%
96.2    1       < 0.1%
95.66   2       < 0.1%
94.66   1       < 0.1%
93.88   2       < 0.1%
93.51   1       < 0.1%
93.41   1       < 0.1%
92.73   1       < 0.1%
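
The HIGH CORRELATION flag on BMI is unsurprising if BMI was derived from the other two body measurements. The report does not say how the column was computed; the sketch below assumes the conventional definition (weight in kilograms divided by height in meters squared) and compares it with the six complete rows in the report's Sample table. Agreement is only approximate, which may reflect the height values being shown rounded to centimeters.

```python
# (HeightInMeters, WeightInKilograms, reported BMI) from sample rows 1 to 6.
samples = [
    (1.60, 68.04, 26.57),
    (1.65, 63.50, 23.30),
    (1.57, 53.98, 21.77),
    (1.80, 84.82, 26.08),
    (1.65, 62.60, 22.96),
    (1.63, 73.48, 27.81),
]

# Absolute gap between the conventional formula and the reported value.
diffs = [abs(weight / height ** 2 - bmi) for height, weight, bmi in samples]
max_diff = max(diffs)

for (height, weight, bmi), diff in zip(samples, diffs):
    print(height, weight, bmi, round(weight / height ** 2, 2), round(diff, 2))
```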

AlcoholDrinkers
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 4901
Missing (%): 1.3%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
True        196484   51.6%
False       179547   47.1%
(Missing)   4901     1.3%

HIVTesting
Boolean

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 18321
Missing (%): 4.8%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
False       239602   62.9%
True        123009   32.3%
(Missing)   18321    4.8%

FluVaxLast12
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 2705
Missing (%): 0.7%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
True        198580   52.1%
False       179647   47.2%
(Missing)   2705     0.7%

PneumoVaxEver
Boolean

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 29798
Missing (%): 7.8%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
False       204991   53.8%
True        146143   38.4%
(Missing)   29798    7.8%

TetanusLast10Tdap
Categorical

MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 34687
Missing (%): 9.1%
Memory size: 33.9 MiB

Value                                                       Count
No, did not receive any tetanus shot in the past 10 years   116598
Yes, received tetanus shot but not sure what type           108419
Yes, received Tdap                                          94904
Yes, received tetanus shot, but not Tdap                    26324

Length

Max length: 57
Median length: 49
Mean length: 42.512813
Min length: 18

Characters and Unicode

Total characters: 14719849
Distinct characters: 24
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yes, received tetanus shot but not sure what type
2nd row: No, did not receive any tetanus shot in the past 10 years
3rd row: No, did not receive any tetanus shot in the past 10 years
4th row: No, did not receive any tetanus shot in the past 10 years
5th row: No, did not receive any tetanus shot in the past 10 years

Common Values

Value                                                       Count    Frequency (%)
No, did not receive any tetanus shot in the past 10 years   116598   30.6%
Yes, received tetanus shot but not sure what type           108419   28.5%
Yes, received Tdap                                          94904    24.9%
Yes, received tetanus shot, but not Tdap                    26324    6.9%
(Missing)                                                   34687    9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of common values; image not reproduced]

Value              Count     Frequency (%)
not                251341    8.8%
tetanus            251341    8.8%
shot               251341    8.8%
received           229647    8.1%
yes                229647    8.1%
but                134743    4.7%
tdap               121228    4.3%
10                 116598    4.1%
years              116598    4.1%
no                 116598    4.1%
Other values (9)   1024845   36.0%

Most occurring characters

ValueCountFrequency (%)
2497682
17.0%
e 1969757
13.4%
t 1590141
10.8%
s 1073944
 
7.3%
a 830782
 
5.6%
n 735878
 
5.0%
o 619280
 
4.2%
d 584071
 
4.0%
i 579441
 
3.9%
r 571262
 
3.9%
Other values (14) 3667611
24.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11148929
75.7%
Space Separator 2497682
 
17.0%
Uppercase Letter 467473
 
3.2%
Other Punctuation 372569
 
2.5%
Decimal Number 233196
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1969757
17.7%
t 1590141
14.3%
s 1073944
9.6%
a 830782
 
7.5%
n 735878
 
6.6%
o 619280
 
5.6%
d 584071
 
5.2%
i 579441
 
5.2%
r 571262
 
5.1%
u 494503
 
4.4%
Other values (7) 2099870
18.8%
Uppercase Letter
ValueCountFrequency (%)
Y 229647
49.1%
T 121228
25.9%
N 116598
24.9%
Decimal Number
ValueCountFrequency (%)
0 116598
50.0%
1 116598
50.0%
Space Separator
ValueCountFrequency (%)
2497682
100.0%
Other Punctuation
ValueCountFrequency (%)
, 372569
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11616402
78.9%
Common 3103447
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1969757
17.0%
t 1590141
13.7%
s 1073944
9.2%
a 830782
 
7.2%
n 735878
 
6.3%
o 619280
 
5.3%
d 584071
 
5.0%
i 579441
 
5.0%
r 571262
 
4.9%
u 494503
 
4.3%
Other values (10) 2567343
22.1%
Common
ValueCountFrequency (%)
2497682
80.5%
, 372569
 
12.0%
0 116598
 
3.8%
1 116598
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14719849
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2497682
17.0%
e 1969757
13.4%
t 1590141
10.8%
s 1073944
 
7.3%
a 830782
 
5.6%
n 735878
 
5.0%
o 619280
 
4.2%
d 584071
 
4.0%
i 579441
 
3.9%
r 571262
 
3.9%
Other values (14) 3667611
24.9%

HighRiskLastYear
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1476
Missing (%): 0.4%
Memory size: 744.1 KiB

Value       Count    Frequency (%)
False       362957   95.3%
True        16499    4.3%
(Missing)   1476     0.4%
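
The alert list at the top of the report gives an imbalance of 74.2% for HighRiskLastYear. The report does not define that score. The sketch below assumes it is one minus the Shannon entropy of the non-missing class distribution, normalized by the maximum entropy for the number of classes; that assumption reproduces the reported figure from the counts above. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumes at least two classes.
    """
    total = sum(counts.values())
    shares = [n / total for n in counts.values() if n > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

# Non-missing counts as reported for HighRiskLastYear.
high_risk = {"False": 362957, "True": 16499}
score = imbalance_score(high_risk)
print(f"{score:.1%}")
```

A perfectly balanced boolean would score 0 under this definition, so CovidPos (70.9% False, 29.1% True) lands far lower than HighRiskLastYear.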

CovidPos
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 372.1 KiB

Value   Count    Frequency (%)
False   270055   70.9%
True    110877   29.1%

Interactions

[Interaction plots (36 Matplotlib SVG figures) not reproduced in this text version.]

Correlations

PhysicalHealthDaysMentalHealthDaysSleepHoursHeightInMetersWeightInKilogramsBMISexGeneralHealthLastCheckupTimePhysicalActivitiesRemovedTeethHadHeartAttackHadAnginaHadStrokeHadAsthmaHadSkinCancerHadCOPDHadDepressiveDisorderHadKidneyDiseaseHadArthritisHadDiabetesDeafOrHardOfHearingBlindOrVisionDifficultyDifficultyConcentratingDifficultyWalkingDifficultyDressingBathingDifficultyErrandsSmokerStatusECigaretteUsageChestScanRaceEthnicityCategoryAgeCategoryAlcoholDrinkersHIVTestingFluVaxLast12PneumoVaxEverTetanusLast10TdapHighRiskLastYearCovidPos
PhysicalHealthDays1.0000.312-0.084-0.0560.0590.0990.0640.3120.0410.2440.1180.1420.1540.1360.1350.0350.2240.2190.1430.2480.0940.1100.1560.2500.4390.3360.3440.0750.0310.1950.0210.0410.1320.0660.0240.1060.0270.0300.075
MentalHealthDays0.3121.000-0.152-0.0630.0070.0390.1000.1530.0370.1170.0520.0450.0380.0480.1300.0490.1040.4430.0390.0740.0300.0450.1080.3840.1610.1680.2520.0820.1040.0660.0290.0840.0580.1350.0630.0490.0300.1250.063
SleepHours-0.084-0.1521.000-0.012-0.066-0.0680.0270.1060.0350.1230.0690.0680.0510.0720.0740.0440.0970.1230.0590.0790.0420.0650.1050.1670.1640.1400.1640.0630.0480.0940.0510.0570.0870.0900.0690.0710.0260.0530.054
HeightInMeters-0.056-0.063-0.0121.0000.5030.0110.6660.0360.0510.0890.0400.0340.0270.0250.0590.0120.0480.0830.0320.0950.0400.0290.0500.0450.0870.0310.0710.0320.0320.0260.0670.0440.1220.0260.0590.0820.0450.0480.017
WeightInKilograms0.0590.007-0.0660.5031.0000.8460.3600.0950.0120.0960.0340.0380.0420.0140.0620.0370.0530.0620.0290.0670.0950.0260.0260.0460.1230.0880.0760.0400.0160.0730.0510.0650.0500.0470.0280.0300.0320.0130.067
BMI0.0990.039-0.0680.0110.8461.0000.1100.1240.0320.1550.0510.0290.0400.0170.1080.0460.0710.1180.0490.1200.1170.0210.0370.0780.1830.1100.1050.0330.0270.0650.0460.0640.0780.0440.0240.0140.0150.0230.069
Sex0.0640.1000.0270.6660.3600.1101.0000.0310.1070.0630.0150.0720.0580.0000.0770.0030.0320.1350.0140.1020.0890.0680.0220.0370.0720.0100.0700.0770.0620.0520.0400.0740.1050.0050.0690.0680.1070.0520.016
GeneralHealth0.3120.1530.1060.0360.0950.1240.0311.0000.0520.2950.1730.2030.2170.1780.1400.0370.2760.2190.1910.2710.1620.1460.1990.2800.4570.3400.3670.1030.0340.2490.0580.0770.1910.0480.0510.1440.0520.0060.011
LastCheckupTime0.0410.0370.0350.0510.0120.0320.1070.0521.0000.0380.0540.0710.0840.0600.0250.0790.0640.0300.0680.1690.0860.0590.0200.0130.1040.0400.0260.0600.0640.1560.0460.1530.0590.0200.2260.2070.0570.0580.023
PhysicalActivities0.2440.1170.1230.0890.0960.1550.0630.2950.0381.0000.1990.0860.0790.0820.0460.0050.1400.0810.0860.1280.1470.0760.0950.1090.2860.1720.1930.1180.0240.1020.0720.1270.1610.0230.0250.0530.1060.0210.014
RemovedTeeth0.1180.0520.0690.0400.0340.0510.0150.1730.0540.1991.0000.1760.1610.1390.0440.0570.2600.0710.1130.2520.1180.1540.1360.1140.2880.1450.1610.1660.0270.2240.0480.2210.1870.0260.0350.1750.0740.0480.063
HadHeartAttack0.1420.0450.0680.0340.0380.0290.0720.2030.0710.0860.1761.0000.4430.1850.0260.0530.1410.0280.1150.1240.1510.1040.0800.0540.1650.0870.0940.1020.0230.1740.0330.1870.0740.0150.0480.1200.0440.0220.021
HadAngina0.1540.0380.0510.0270.0420.0400.0580.2170.0840.0790.1610.4431.0000.1520.0360.0820.1590.0330.1490.1520.1580.1140.0740.0490.1760.0930.0940.0910.0330.1890.0500.2150.0660.0250.0800.1580.0320.0280.016
HadStroke0.1360.0480.0720.0250.0140.0170.0000.1780.0600.0820.1390.1850.1521.0000.0380.0430.1110.0480.0920.1070.1130.0830.0980.0880.1720.1120.1300.0670.0170.1410.0350.1470.0700.0040.0370.0910.0340.0150.021
HadAsthma0.1350.1300.0740.0590.0620.1080.0770.1400.0250.0460.0440.0260.0360.0381.0000.0000.2050.1530.0370.0970.0560.0280.0490.1110.1080.0750.0960.0350.0470.0860.0370.0570.0280.0760.0190.0890.0440.0310.046
HadSkinCancer0.0350.0490.0440.0120.0370.0460.0030.0370.0790.0050.0570.0530.0820.0430.0001.0000.0470.0130.0620.1270.0320.0840.0100.0190.0500.0120.0060.0700.0660.0950.1450.2600.0080.0640.1150.1680.0240.0420.033
HadCOPD0.2240.1040.0970.0480.0530.0710.0320.2760.0640.1400.2600.1410.1590.1110.2050.0471.0000.1260.0960.1840.1110.1100.1030.1220.2460.1510.1630.2330.0610.2060.0570.1590.0860.0300.0450.1640.0420.0100.012
HadDepressiveDisorder0.2190.4430.1230.0830.0620.1180.1350.2190.0300.0810.0710.0280.0330.0480.1530.0130.1261.0000.0520.1210.0560.0380.0870.3400.1520.1330.2090.1220.1400.0720.0630.1190.0280.1410.0170.0370.0560.0890.042
HadKidneyDisease0.1430.0390.0590.0320.0290.0490.0140.1910.0680.0860.1130.1150.1490.0920.0370.0620.0960.0521.0000.1320.1690.0780.0740.0530.1610.0880.1000.0440.0240.1280.0240.1450.0820.0020.0670.1310.0150.0190.008
HadArthritis0.2480.0740.0790.0950.0670.1200.1020.2710.1690.1280.2520.1240.1520.1070.0970.1270.1840.1210.1321.0000.1740.1520.0970.1020.3330.1530.1520.1370.0640.2310.1200.4000.0960.0270.1480.2720.0540.0650.036
HadDiabetes0.0940.0300.0420.0400.0950.1170.0890.1620.0860.1470.1180.1510.1580.1130.0560.0320.1110.0560.1690.1741.0000.0930.0990.0670.2250.1060.1100.0400.0290.1570.0420.1370.1530.0330.1010.1880.0310.0410.016
DeafOrHardOfHearing0.1100.0450.0650.0290.0260.0210.0680.1460.0590.0760.1540.1040.1140.0830.0280.0840.1100.0380.0780.1520.0931.0000.1340.1100.1750.0990.1000.0840.0200.1260.0580.2490.0520.0350.0570.1320.0510.0200.029
BlindOrVisionDifficulty0.1560.1080.1050.0500.0260.0370.0220.1990.0200.0950.1360.0800.0740.0980.0490.0100.1030.0870.0740.0970.0990.1341.0000.1700.2000.1540.2160.0690.0310.0890.0720.0810.0740.0270.0080.0420.0430.0110.007
DifficultyConcentrating0.2500.3840.1670.0450.0460.0780.0370.2800.0130.1090.1140.0540.0490.0880.1110.0190.1220.3400.0530.1020.0670.1100.1701.0000.2190.2040.3130.1240.1370.0860.0600.0940.0660.0900.0440.0110.0250.0900.026
DifficultyWalking0.4390.1610.1640.0870.1230.1830.0720.4570.1040.2860.2880.1650.1760.1720.1080.0500.2460.1520.1610.3330.2250.1750.2000.2191.0000.3920.3900.1290.0260.2200.0410.2600.1690.0050.0620.1750.0750.0330.036
DifficultyDressingBathing0.3360.1680.1400.0310.0880.1100.0100.3400.0400.1720.1450.0870.0930.1120.0750.0120.1510.1330.0880.1530.1060.0990.1540.2040.3921.0000.4210.0880.0270.1160.0390.0840.0850.0430.0040.0620.0330.0030.014
DifficultyErrands0.3440.2520.1640.0710.0760.1050.0700.3670.0260.1930.1610.0940.0940.1300.0960.0060.1630.2090.1000.1520.1100.1000.2160.3130.3900.4211.0000.1000.0610.1310.0350.0880.1180.0470.0020.0740.0410.0260.016
SmokerStatus0.0750.0820.0630.0320.0400.0330.0770.1030.0600.1180.1660.1020.0910.0670.0350.0700.2330.1220.0440.1370.0400.0840.0690.1240.1290.0880.1001.0000.1780.1550.0720.1420.0460.1220.1210.1210.0450.0740.047
ECigaretteUsage0.0310.1040.0480.0320.0160.0270.0620.0340.0640.0240.0270.0230.0330.0170.0470.0660.0610.1400.0240.0640.0290.0200.0310.1370.0260.0270.0610.1781.0000.0250.0420.1710.0710.1230.1300.0900.0060.1750.054
ChestScan0.1950.0660.0940.0260.0730.0650.0520.2490.1560.1020.2240.1740.1890.1410.0860.0950.2060.0720.1280.2310.1570.1260.0890.0860.2200.1160.1310.1550.0251.0000.0820.2870.0920.0460.0950.2280.0600.0250.005
RaceEthnicityCategory0.0210.0290.0510.0670.0510.0460.0400.0580.0460.0720.0480.0330.0500.0350.0370.1450.0570.0630.0240.1200.0420.0580.0720.0600.0410.0390.0350.0720.0420.0821.0000.1210.0890.1650.1200.1550.0700.0650.056
AgeCategory0.0410.0840.0570.0440.0650.0640.0740.0770.1530.1270.2210.1870.2150.1470.0570.2600.1590.1190.1450.4000.1370.2490.0810.0940.2600.0840.0880.1420.1710.2870.1211.0000.1480.3090.2950.5080.0920.2140.185
AlcoholDrinkers0.1320.0580.0870.1220.0500.0780.1050.1910.0590.1610.1870.0740.0660.0700.0280.0080.0860.0280.0820.0960.1530.0520.0740.0660.1690.0850.1180.0460.0710.0920.0890.1481.0000.0540.0070.0790.0840.0780.041
HIVTesting0.0660.1350.0900.0260.0470.0440.0050.0480.0200.0230.0260.0150.0250.0040.0760.0640.0300.1410.0020.0270.0330.0350.0270.0900.0050.0430.0470.1220.1230.0460.1650.3090.0541.0000.0450.0740.1210.1340.077
FluVaxLast120.0240.0630.0690.0590.0280.0240.0690.0510.2260.0250.0350.0480.0800.0370.0190.1150.0450.0170.0670.1480.1010.0570.0080.0440.0620.0040.0020.1210.1300.0950.1200.2950.0070.0451.0000.3440.1350.0650.068
PneumoVaxEver0.1060.0490.0710.0820.0300.0140.0680.1440.2070.0530.1750.1200.1580.0910.0890.1680.1640.0370.1310.2720.1880.1320.0420.0110.1750.0620.0740.1210.0900.2280.1550.5080.0790.0740.3441.0000.1260.0630.077
TetanusLast10Tdap0.0270.0300.0260.0450.0320.0150.1070.0520.0570.1060.0740.0440.0320.0340.0440.0240.0420.0560.0150.0540.0310.0510.0430.0250.0750.0330.0410.0450.0060.0600.0700.0920.0840.1210.1350.1261.0000.0150.052
HighRiskLastYear0.0300.1250.0530.0480.0130.0230.0520.0060.0580.0210.0480.0220.0280.0150.0310.0420.0100.0890.0190.0650.0410.0200.0110.0900.0330.0030.0260.0740.1750.0250.0650.2140.0780.1340.0650.0630.0151.0000.053
CovidPos0.0750.0630.0540.0170.0670.0690.0160.0110.0230.0140.0630.0210.0160.0210.0460.0330.0120.0420.0080.0360.0160.0290.0070.0260.0360.0140.0160.0470.0540.0050.0560.1850.0410.0770.0680.0770.0520.0531.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
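
The report does not show how nullity correlation is computed. One common approach, assumed here, is the Pearson correlation between per-column missing-value indicators. The sketch uses a toy table: the first three rows echo the height, weight, and BMI values of the first three sample rows, and the fourth row is hypothetical, added so the columns differ in their missingness.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Toy data; None marks a missing cell. The last row is hypothetical.
rows = [
    {"HeightInMeters": None, "WeightInKilograms": None, "BMI": None},
    {"HeightInMeters": 1.60, "WeightInKilograms": 68.04, "BMI": 26.57},
    {"HeightInMeters": 1.65, "WeightInKilograms": 63.50, "BMI": 23.30},
    {"HeightInMeters": 1.57, "WeightInKilograms": None, "BMI": None},
]
columns = ["HeightInMeters", "WeightInKilograms", "BMI"]

# 1 where the value is missing, 0 where it is present.
missing = {col: [1 if row[col] is None else 0 for row in rows] for col in columns}

r_height_weight = pearson(missing["HeightInMeters"], missing["WeightInKilograms"])
r_weight_bmi = pearson(missing["WeightInKilograms"], missing["BMI"])
print(round(r_height_weight, 3), round(r_weight_bmi, 3))
```

In this toy table BMI is missing exactly when weight is missing, so their nullity correlation is 1, while height and weight are missing together only some of the time.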

Sample

| State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos
0 | Alabama | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | NaN | No | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | Never smoked | Not at all (right now) | No | White only, Non-Hispanic | Age 80 or older | NaN | NaN | NaN | No | No | Yes | No | Yes, received tetanus shot but not sure what type | No | No
1 | Alabama | Female | Excellent | 0.0 | 0.0 | NaN | No | 6.0 | NaN | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.60 | 68.04 | 26.57 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No
2 | Alabama | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | Yes | No | No | No | No | Yes | No | No | No | No | No | No | No | Current smoker - now smokes some days | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | NaN | 1.65 | 63.50 | 23.30 | No | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No
3 | Alabama | Female | Fair | 2.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 9.0 | NaN | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 40 to 44 | 1.57 | 53.98 | 21.77 | Yes | No | No | Yes | No, did not receive any tetanus shot in the past 10 years | No | No
4 | Alabama | Male | Poor | 1.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 7.0 | NaN | Yes | No | Yes | No | No | No | No | No | No | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.80 | 84.82 | 26.08 | No | No | No | Yes | No, did not receive any tetanus shot in the past 10 years | No | No
5 | Alabama | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 80 or older | 1.65 | 62.60 | 22.96 | Yes | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No
6 | Alabama | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | NaN | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 80 or older | 1.63 | 73.48 | 27.81 | No | No | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No
7 | Alabama | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 6.0 | NaN | No | No | No | No | Yes | No | No | No | Yes | No | No | Yes | No | Yes | No | No | Former smoker | Not at all (right now) | NaN | White only, Non-Hispanic | Age 75 to 79 | 1.70 | NaN | NaN | No | Yes | No | No | Yes, received tetanus shot but not sure what type | No | No
8 | Alabama | Female | Good | 1.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | No | No | No | No | Yes | No | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | NaN | White only, Non-Hispanic | Age 70 to 74 | 1.68 | 81.65 | 29.05 | Yes | NaN | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No
9 | Alabama | Female | Fair | 8.0 | 9.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | NaN | No | No | No | No | No | No | No | No | No | No | NaN | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 80 or older | 1.60 | 74.84 | 29.23 | No | No | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No
| State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos
380922 | Virgin Islands | Male | Fair | 10.0 | 0.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | NaN | 1 to 5 | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Current smoker - now smokes some days | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 50 to 54 | 1.80 | 90.72 | 27.89 | Yes | Yes | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes
380923 | Virgin Islands | Male | Fair | NaN | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 6.0 | 1 to 5 | No | No | No | No | No | Yes | Yes | No | Yes | No | No | No | No | No | No | No | Former smoker | Not at all (right now) | Yes | Black only, Non-Hispanic | Age 35 to 39 | 1.85 | 104.33 | 30.34 | No | Yes | No | NaN | Yes, received tetanus shot but not sure what type | No | Yes
380924 | Virgin Islands | Female | Fair | 0.0 | 10.0 | Within past year (anytime less than 12 months ago) | Yes | 6.0 | 1 to 5 | No | No | No | No | No | No | No | No | Yes | No | No | No | No | Yes | No | Yes | Never smoked | Never used e-cigarettes in my entire life | NaN | Black only, Non-Hispanic | Age 80 or older | 1.65 | 88.45 | 32.45 | Yes | NaN | No | No | NaN | No | Yes
380925 | Virgin Islands | Male | Good | 14.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | NaN | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | Hispanic | Age 30 to 34 | 1.83 | 95.25 | 28.48 | No | Yes | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes
380926 | Virgin Islands | Male | Fair | 30.0 | 1.0 | Within past year (anytime less than 12 months ago) | No | 6.0 | 6 or more, but not all | No | NaN | Yes | No | No | Yes | No | No | No | No, pre-diabetes or borderline diabetes | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 70 to 74 | 1.78 | 70.31 | 22.24 | No | No | Yes | NaN | Yes, received tetanus shot but not sure what type | No | Yes
380927 | Virgin Islands | Female | Fair | 0.0 | 7.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 25 to 29 | 1.93 | 90.72 | 24.34 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes
380928 | Virgin Islands | Male | Good | 0.0 | 15.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 1 to 5 | No | No | Yes | No | No | No | No | No | Yes | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Multiracial, Non-Hispanic | Age 65 to 69 | 1.68 | 83.91 | 29.86 | Yes | Yes | Yes | Yes | Yes, received tetanus shot but not sure what type | No | Yes
380929 | Virgin Islands | Male | Good | 0.0 | 0.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 30 to 34 | 1.83 | 104.33 | 31.19 | Yes | NaN | No | No | NaN | No | Yes
380930 | Virgin Islands | Female | Good | 0.0 | 3.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | 6.0 | None of them | No | No | No | Yes | No | No | Yes | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Black only, Non-Hispanic | Age 18 to 24 | 1.65 | 69.85 | 25.63 | NaN | Yes | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes
380931 | Virgin Islands | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 5.0 | None of them | Yes | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Black only, Non-Hispanic | Age 70 to 74 | 1.83 | 108.86 | 32.55 | No | Yes | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | Yes

Duplicate rows

Most frequently occurring

| State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos | # duplicates
0 | Arizona | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | Yes | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 75 to 79 | 1.63 | 56.70 | 21.46 | Yes | No | Yes | Yes | Yes, received Tdap | No | No | 2
1 | Maryland | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 65 to 69 | 1.65 | 45.36 | 16.64 | Yes | No | Yes | Yes | Yes, received Tdap | No | No | 2
2 | Maryland | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 50 to 54 | 1.75 | 65.77 | 21.41 | Yes | No | Yes | Yes | Yes, received Tdap | No | No | 2
3 | Montana | Male | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 5.0 | None of them | No | No | No | No | No | No | No | No | No | Yes | Yes | No | No | No | No | No | Former smoker | Not at all (right now) | No | NaN | Age 50 to 54 | 1.75 | 97.52 | 31.75 | Yes | Yes | Yes | No | Yes, received tetanus shot but not sure what type | No | No | 2
4 | New Jersey | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 50 to 54 | NaN | NaN | NaN | Yes | No | Yes | No | Yes, received tetanus shot but not sure what type | No | Yes | 2
5 | New Jersey | Male | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | 6 or more, but not all | No | No | Yes | No | No | No | No | No | No | Yes | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 75 to 79 | 1.63 | 80.74 | 30.55 | Yes | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No | 2
6 | Rhode Island | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 1 to 5 | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 75 to 79 | 1.57 | 68.04 | 27.44 | Yes | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No | 2
7 | South Dakota | Female | Fair | 30.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | Yes | No | No | No | Yes | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 70 to 74 | 1.75 | 90.72 | 29.53 | No | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No | 2
8 | Vermont | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 9.0 | None of them | No | No | No | No | No | No | No | No | Yes | No | No | No | No | Yes | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 70 to 74 | 1.65 | 79.38 | 29.12 | Yes | No | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No | 2
9 | Washington | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 60 to 64 | 1.80 | 77.11 | 23.71 | Yes | Yes | Yes | No | Yes, received Tdap | No | No | 2
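
A table like the one above can be produced by counting how often each complete row occurs and keeping the rows seen more than once. This is a minimal sketch under that assumption, not the report's actual code; `most_frequent_duplicates` and the three-field `records` are hypothetical, and missing cells are written as `None` so that two rows with the same gaps compare equal.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return up to `top` (row, count) pairs for rows that occur more than once."""
    # Rows are converted to tuples so they can be hashed and counted.
    counts = Counter(tuple(row) for row in rows)
    duplicates = [(row, n) for row, n in counts.items() if n > 1]
    # Most frequent first; ties keep first-seen order because the sort is stable.
    duplicates.sort(key=lambda item: -item[1])
    return duplicates[:top]


# Hypothetical three-field records (State, Sex, HeightInMeters) for illustration.
records = [
    ("Arizona", "Female", 1.63),
    ("Maryland", "Male", 1.75),
    ("Arizona", "Female", 1.63),
    ("Montana", "Male", None),
    ("Montana", "Male", None),
]

result = most_frequent_duplicates(records)
for row, n in result:
    print(row, n)
```

With a pandas DataFrame, `df.duplicated().sum()` counts the repeated occurrences of rows, which is one way to arrive at a figure like the 11 duplicate rows reported in the overview.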